Multi-method Audio-based Retrieval of Multimedia Information

Author

  • MARIO MALCANGI
Abstract

Multimedia information and embedded systems are two major technological advances that have significantly changed the way people interact with systems and information in recent years. In this context, audio proves to be the most advantageous medium for interacting with embedded systems and their content. Advantages include hands-free operation, unattended interaction, and simple, cheap devices for capture and playback. The use of embedded systems to seek information stored locally or on the web highlights several difficulties inherent in the nature of multimedia-information signals. These difficulties are especially evident when palmtop or deeply embedded devices are used for such purposes. Developing a set of digital-signal-processing-based algorithms for extracting audio information is a primary step toward providing user-friendly access to multimedia information and developing powerful communication interfaces. The algorithms aim to extract semantic and syntactic information from audio signals, including voice. Extracted audio features are employed to access information in multimedia databases, as well as to index it. More extensive, higher-level information, such as audio-source identification (speaker identification) and genre (in the case of music), must also be extracted from the audio signal. One basic task involves transforming audio into symbols (e.g. music transformed into a score, speech transformed into text) and transcribing symbols into audio (e.g. a score transformed into musical audio, text transformed into speech). The purpose is to search for and access any kind of multimedia information by means of audio. To attain these results, digital audio-processing, digital speech-processing, and soft-computing methods need to be integrated. Neural networks are used as classifiers and fuzzy logic is used for making smart decisions.
Key-Words: Audio features, multimedia information, speech-to-text, audio-to-score, text-to-speech, score-to-audio, digital audio processing, pattern matching, soft computing
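
As a rough illustration of the pipeline the abstract outlines (low-level audio features feeding a soft-computing decision), the sketch below computes two classic frame-level features, short-time energy and zero-crossing rate, and combines them with a simple fuzzy rule. It is not the author's method: the function names (frame_features, ramp, tonal_degree), the frame length, and all membership thresholds are hypothetical values chosen only for illustration, and the neural-network classifier mentioned in the abstract is not modelled here.

```python
import math

def frame_features(samples, frame_len=256):
    """Return (short-time energy, zero-crossing rate) for each full frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        # Mean squared amplitude of the frame.
        energy = sum(x * x for x in frame) / frame_len
        # Fraction of adjacent sample pairs whose sign differs.
        crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
        feats.append((energy, crossings / (frame_len - 1)))
    return feats

def ramp(x, lo, hi):
    """Fuzzy membership rising linearly from 0 at lo to 1 at hi."""
    if x <= lo:
        return 0.0
    if x >= hi:
        return 1.0
    return (x - lo) / (hi - lo)

def tonal_degree(energy, zcr):
    """Fuzzy AND (minimum) of 'energy is high' and 'zero-crossing rate is low'.

    The thresholds are illustrative assumptions, not values from the paper.
    """
    high_energy = ramp(energy, 0.01, 0.1)
    low_zcr = 1.0 - ramp(zcr, 0.1, 0.4)
    return min(high_energy, low_zcr)

if __name__ == "__main__":
    # A 200 Hz tone sampled at 8 kHz, used here as a stand-in for real audio.
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(1024)]
    for energy, zcr in frame_features(tone):
        print(round(energy, 3), round(zcr, 3), round(tonal_degree(energy, zcr), 3))
```

In a fuller system, per-frame values like these could serve as index keys or classifier inputs; the fuzzy rule here only shows how graded memberships can replace hard thresholds when a decision is made.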

Similar resources

Hierarchical Support Vector Machines for Audio Classification

Audio data is a typical kind of multimedia data and contains plenty of information. Audio retrieval is becoming an important part of multimedia information retrieval. In multimedia retrieval research, constructing better classifiers for audio classification and retrieval is an increasingly important topic. Support Vector Machines, a novel method of the Pattern Recognition, p...

Prototyping a Vibrato-Aware Query-By-Humming (QBH) Music Information Retrieval System for Mobile Communication Devices: Case of Chromatic Harmonica

Background and Aim: The current research aims at prototyping query-by-humming music information retrieval systems for smart phones. Methods: This multi-method research follows simulation technique from mixed models of the operations research methodology, and the documentary research method, simultaneously. Two chromatic harmonica albums comprised the research population. To achieve the purpose ...

Multi-Modal Retrieval for Multimedia Digital Libraries: Issues, Architecture, and Mechanisms

Supporting effective and efficient retrieval of multimedia data is a challenging problem in building a digital library. In this paper, we examine the issues related to accommodating multi-modal retrieval of multimedia data (text, image, video and audio), and propose 2M2Net as a generic framework for such versatile retrieval in multimedia digital libraries. The retrieval is conducted based on th...

Cipher text only attack on speech time scrambling systems using correction of audio spectrogram

Recently permutation multimedia ciphers were broken in a chosen-plaintext scenario. That attack models a very resourceful adversary which may not always be the case. To show insecurity of these ciphers, we present a cipher-text only attack on speech permutation ciphers. We show inherent redundancies of speech can pave the path for a successful cipher-text only attack. To that end, regularities ...

Intelligent Content-Based Audio Classification and Retrieval for Web Applications

Content-based technology has emerged from the development of multimedia signal processing and the widespread use of web applications. In this chapter, we discuss the issues involved in content-based audio classification and retrieval, including spoken document retrieval and music information retrieval. Further, along this direction, we conclude that the emerging audio ontology can be applied in fas...

Automatic Annotation of Formula 1 Races for Content-Based Video Retrieval

Content-based video retrieval is emerging as an important part in the process of utilization of various multimedia documents. In this report we present a novel system for the automatic indexing and content-based retrieval of multimedia documents. We chose the domain of Formula 1 sport videos because the manual annotation of Formula 1 races is complicated and time consuming. Our system uses mult...

Publication date: 2010